drmr: A Bayesian approach to Dynamic Range Models in R

Available at lcgodoy.me/slides/2025-esa/

Lucas da Cunha Godoy

EEB Department, UCSC

2025-07-28

Introduction

The Challenge & Limits of SDMs

  • Critical Challenge: Predicting species’ responses to global environmental change is vital for conservation and management.

  • Usual Tool: Species Distribution Models (SDMs) have been the workhorse, correlating occurrences with environmental variables.

  • Limitations of SDMs:

    • Struggle to predict responses under novel future conditions (Pagel and Schurr 2012).
    • Equilibrium Assumption: Often violated (Guisan and Thuiller 2005).
    • Lack Mechanism: They do not explicitly model the underlying biological processes.

Dynamic Range Models (DRMs): A Mechanistic Approach

  • DRMs explicitly incorporate demographic processes that drive range dynamics (Pagel and Schurr 2012).

  • Allows linking environmental drivers directly to specific processes.

  • Potential for more robust forecasting under novel conditions.

  • Why are they rarely in practice? Despite conceptual appeal, DRMs have been underutilized due to their complexity and computational fitting challenges.

Introducing drmr: Making DRMs Accessible

  • Goal: Bridge the gap between DRM potential and practical application.

  • Key Features:

    • Develops and applies age-structured DRMs in a Bayesian framework.
    • User-friendly interface.
    • Leverages Stan via cmdstanr for efficient fitting (Gabry et al. 2024).
    • Easily relate environmental drivers to specific recruitment and survival.

Age-structured DRM

Setup

  • \(Y_{a, t, i}\): Unobserved density of individuals of age \(a\), time \(t\) and site \(i\).

  • \(\lambda_{a, t, i}\): Expected age-specific density.

  • \(Y_{t, i} = \sum_{a} Y_{a, t, i}\): Observed density of individuals of all ages at time \(t\) and site \(i\).

  • \(\mu_{t, i} = \sum_{a} \lambda_{a, t, i}\): Expected “overall” density.

  • \(\rho_{t, i}\): Probability of absence (i.e., \(\mathrm{P}(Y_{t, i} = 0)\)).

  • Biological processes are encoded through \(\lambda_{a, t, i}\).

  • Key processes: recruitment, survival, and movement

Modeling assumption and zero-augmented probability density functions (pdf)

\[ f(y_{t, i} \mid \mu_{t, i}, \phi, \rho_{t, i}) = \begin{cases} \rho_{t, i}, & \text{ if } y_{t, i} = 0, \\ (1 - \rho_{t, i}) g(y_{t, i} \mid \mu_{t, i}, \phi), & \text{ if } y_{t, i} > 0, \end{cases} \]

  • \(g(\cdot \mid \mu_{t, i}, \phi)\) is the pdf of a continuous probability distribution.

  • \(\mu_{t, i}\) is the expected value (theoretical mean) of \(Y_{t, i}\) provided the species is present at time \(t\) and site \(i\).

  • \(\phi\) is a nuisance parameter.

What you are about to see

Modeling assumptions diagram

flowchart TD
    A["$$\lambda_{1, t, i} = \exp \{ \mathbf{\beta}^{\top}_r \mathbf{x}^{(r)}_{t, i} + z^{(r)}_{t, i} \} $$"]
    B["$$\lambda_{a, t, i} = \lambda_{a - 1, t - 1, i} \cdot s_{a - 1, t - 1, i} $$"]
    C(("$$\mu_{t, i}$$"))
    D("$$\mathrm{logit}(\rho_{t, i}) = \boldsymbol{\beta}^{\top}_{t} \mathbf{x}^{(t)}_{t, i} $$")
    E["$$\log(s_{a, t, i}) = \boldsymbol{\beta}^{\top}_s \mathbf{x}^{(s)}_{t, i} + z^{(s)}_{t, i} - f_{a, t}$$"]
    F{"$$Y_{t, i}$$"}
    G(("$$\phi$$"))
    A --->B
    E --->B
    B-- $$\sum_a$$ -->C
    C-.->F
    D-.->F
    G-.->F

Diagram v2

graph TD
    subgraph "Demographic processes"
        %% Inputs and Parameters for lambda1
        xr["$$\mathbf{x}^{(r)}_{t, i}$$"] --> lambda1(("$$\lambda_{1, t, i}$$"));
        zr(("$$z^{(r)}_{t, i}$$")) --> lambda1;
        betar(("$$\boldsymbol{\beta}_r$$")) --> lambda1;

        %% Recursive part for lambda_a
        subgraph "Time t-1"
            xs["$$\mathbf{x}^{(s)}_{t - 1, i}$$"] --> s_prev;
            zs(("$$z^{(s)}_{t - 1, i}$$")) --> s_prev;
            betas(("$$\boldsymbol{\beta}_s$$")) --> s_prev;
            f["$$f_{a - 1, t - 1}$$"] --> s_prev;
            lambda_prev("$$\lambda_{a - 1, t - 1, i}$$")
            s_prev("$$s_{a - 1, t - 1, i}$$")
        end
        
        lambda_prev --> lambda_a(("$$\lambda_{a, t, i}$$"));
        s_prev --> lambda_a;
    end

    %% Mean calculation
    lambda_a --> mu(("$$\mu_{t, i}$$"));

subgraph "Zero-augmentation"
        xt["$$\mathbf{x}^{(t)}_{t, i}$$"] --> rho(("$$\rho_{t, i}$$"));
        betat(("$$\boldsymbol{\beta}_t$$")) --> rho;
    end

    %% Final Observation
    mu --> Y{"$$Y_{t, i}$$"};

    %% Other components
    phi(("$$\phi$$")) --> Y;
    rho --> Y;

Code & workflow

Illustration with real data

Summer Flounder Dataset

  • An example analysis uses Summer flounder (Paralichthys dentatus) data from 1982-2016 NOAA bottom trawl surveys (Fredston et al. 2025) to illustrate the package’s features.

  • The data spans the US Atlantic coast (Cape Hatteras, NC to the Canada/Maine border) and was aggregated from individual hauls into 10 latitudinal patches with varying areas.

  • Response variable: Density (count per unit area).

  • Environmental drivers: SST and SBT.

Patches/sites

Models fitted to the data

Model Description
DRM (rec) Recruitment as a quadratic function of SST, AR(1) random effect for recruitment, Fishing mortality, Movement, Absence probability as a function of effort (number of hauls) and estimated density
DRM (surv) Surival as a quadratic function of SBT, AR(1) random effect for recruitment, Fishing mortality, Movement, Absence probability as a function of effort (number of hauls) and estimated density
GLM-SDM A zero-augmented LogNormal SDM with the absence probability as a function of the number of hauls (a measure of effort).

Code for DRM (rec)

drm_rec <-
  fit_drm(.data = dat_train,
          y_col = "dens", ## response variable: density
          time_col = "year", ## vector of time points
          site_col = "patch",
          family = "gamma", ## other options: lognormal, loglogistic
          seed = 202505,
          formula_zero = ~ 1 + c_hauls,
          formula_rec = ~ 1 + c_sst + I(c_sst * c_sst),
          formula_surv = ~ 1,
          f_mort = f_train,
          n_ages = NROW(f_train),
          adj_mat = adj_mat, ## A matrix for movement routine
          ages_movement = c(0, 0,
                            rep(1, 12),
                            0, 0), ## ages allowed to move
          .toggles = list(ar_re = "rec",
                          movement = 1,
                          est_surv = 1,
                          est_init = 0,
                          minit = 1),
          .priors = list(pr_phi_a = 1, pr_phi_b = .1,
                         pr_alpha_a = 4.2, pr_alpha_b = 5.8,
                         pr_zeta_a = 7, pr_zeta_b = 3))

Code for DRM (rec)

drm_surv <-
  fit_drm(.data = dat_train,
          y_col = "dens", ## response variable: density
          time_col = "year", ## vector of time points
          site_col = "patch",
          family = "gamma", ## other options: lognormal, loglogistic
          seed = 202505,
          formula_zero = ~ 1 + c_hauls,
          formula_rec = ~ 1,
          formula_surv = ~ 1 + c_sbt + I(c_sst * c_sbt),
          f_mort = f_train,
          n_ages = NROW(f_train),
          adj_mat = adj_mat, ## A matrix for movement routine
          ages_movement = c(0, 0,
                            rep(1, 12),
                            0, 0), ## ages allowed to move
          .toggles = list(ar_re = "rec", ## other options: "surv", "dens"
                          movement = 1,
                          est_surv = 1,
                          est_init = 0,
                          minit = 1),
          .priors = list(pr_phi_a = 1, pr_phi_b = .1,
                         pr_alpha_a = 4.2, pr_alpha_b = 5.8,
                         pr_zeta_a = 7, pr_zeta_b = 3))

Effect of environmental variables on specific processes

  • Recruitment and SST:
newdata_rec <- data.frame(c_sst =
                            seq(from = quantile(dat_train$c_sst, .05),
                                to = quantile(dat_train$c_sst, .95),
                                length.out = 200))

rec_samples <- marg_rec(drm_rec, newdata_rec)
  • Survival and SBT:
newdata_surv <- data.frame(c_sbt =
                             seq(from = quantile(dat_train$c_sbt, .05),
                                 to = quantile(dat_train$c_sbt, .95),
                                 length.out = 200))

surv_samples <- marg_surv(drm_surv, newdata_surv)

Effect of environmental variables

Forecasting

  • The code for obtaining out-of-sample predictions is quite intuitive.

  • We only need to provide the output for a model fit, a new dataset (where we want the predictions), and a seed:

forecast_rec <- predict_drm(drm = drm_rec,
                            new_data = dat_test,
                            past_data = filter(dat_train,
                                               year == max(year)),
                            seed = 125,
                            cores = 4)

Forecasts visualization

Comparing models

Model RMSE IS PIC
DRM (rec) 0.08 0.74 90.0
DRM (surv) 0.10 1.01 96.0
GLM-SDM 0.14 1.18 74.0

Concluding remarks

Highlights

  • The drmr substantially lowers the barrier for ecologists to use the DRM in their applications.

  • The code is easy to use and takes advantage of what has been developed for Stan: visualization, diagnostic tools, and estimation.

  • drmr allows for empirically testing which processes are more important to predict the distribution of a species.

  • The more complex a model is, the more (and better) data we need to be able to estimate those relationships.

Future work

  • Include population dynamic models that explicitly relate adult density to recruitment (e.g., Ricker, Belverton-Holt)

  • GAM-like non-parametric relationships between processes and the environment

  • Support for length-composition data

  • More realistic movement routines

Acknowledgements

  • Malin, Alexa, Jude, Jeewantha, Mark, and many others.

References

Fredston, A., Ovando, D., Godoy, L. da C., Kong, J., Muffley, B., Thorson, J. T., and Pinsky, M. (2025), “Dynamic range models improve the near-term forecast for a marine species on the move,” Ecology Letters, 00, (under review). https://doi.org/10.32942/X24D00.
Gabry, J., Češnovar, R., Johnson, A., and Bronder, S. (2024), cmdstanr: R interface to CmdStan.
Guisan, A., and Thuiller, W. (2005), “Predicting species distribution: Offering more than simple habitat models,” Ecology letters, Wiley Online Library, 8, 993–1009. https://doi.org/https://doi.org/10.1111/j.1461-0248.2005.00792.x.
Pagel, J., and Schurr, F. M. (2012), “Forecasting species ranges by statistical estimation of ecological niches and spatial population dynamics,” Global Ecology and Biogeography, 21, 293–304. https://doi.org/https://doi.org/10.1111/j.1466-8238.2011.00663.x.